American Journal of Epidemiology — Latest Matching Preprints

1

Bias from small-count suppression in county-level cancer disparity estimates: a calibrated simulation study

Gahan, K.

2026-06-08 epidemiology 10.64898/2026.06.05.26355021 medRxiv

Top 0.1%

38.0%

Background. Area-level cancer disparities are routinely estimated from public county data in which rates based on small counts (fewer than 16 cases or deaths) are suppressed. Analysts typically drop suppressed counties (complete-case analysis). Because suppression depends on case counts tied to population size and demographic composition, this missingness may be informative, but its effect on the disparity estimate has not, to our knowledge, been quantified. Methods. In a cross-sectional ecological study of 3,143 U.S. counties (analytic sample 3,018 with computable exposure) using one frozen public release of NCI State Cancer Profiles incidence and mortality data and ACS 2018-2022 5-year data, we estimated the most- versus least-deprived ICE(race+income) quintile rate ratio (RR) and rate difference for female breast, stomach, and cervix cancers under four suppression-handling methods: complete-case, available-case, bounding, and model-based small-area estimation. We characterized which counties were erased, and, following the ADEMP framework, ran a Monte Carlo simulation (1,000 replicates per cell; Monte Carlo standard error of bias approximately 0.0025) calibrated to the release to measure bias against a known truth. Analyses were pre-registered. Results. The suppressed fraction rose with rarity: 7.4% of counties for breast, 61.3% for stomach, and 75.7% for cervix incidence. Suppression was concentrated in the most-deprived quintile (cervix, 81.8% suppressed vs 63.8% least-deprived) and overwhelmingly removed rural rather than minority residents (cervix: 81% of the rural but 9% of the minority population erased). For breast (little suppression) the RR was 0.87 (95% CI 0.85-0.89) and identical across methods; for cervix incidence the complete-case RR (1.56) exceeded the model-based estimate (1.50), and for cervix mortality (91% suppressed) complete-case (1.86) exceeded model-based (1.56) by 16% with a wide bounding interval (1.88-2.62). 
In calibrated simulation, population-weighted complete-case bias was small (less than 2%) at the observed deprivation-county-size correlation and grew with rarity, threshold, and unweighted aggregation; its direction was conditional, becoming positive (over-estimation) as deprived counties became smaller. Conclusions. Complete-case handling of suppressed counties over-estimates rare-cancer area disparities relative to methods that retain them, while silently erasing most of the rural and most-deprived communities the estimate is meant to represent. The effect is negligible for common cancers and grows with rarity. Public-data disparity analyses should report the suppressed fraction and use bounded or model-based estimates by default. Keywords: cancer disparities; small-count suppression; Index of Concentration at the Extremes; informative missingness; small-area estimation; rural health.

2

Direct and mediated effects (DME) SLCMA: a novel method for life course modelling with time-varying covariates

Beer, S.; Simpkin, A. J.; Eldeeb, S. Y.; Zar, H. J.; Stein, D. J.; Dunn, E. C.; Smith, A. D. A. C.

2026-06-06 epidemiology 10.64898/2026.05.29.26354427 medRxiv

Top 0.1%

33.3%

Background: In prospective cohort studies, where an exposure is collected repeatedly, interest often lies in determining whether the timing of that exposure has a differential effect on a later outcome. The Structured Life Course Modeling Approach (SLCMA), where users select between temporal hypotheses of exposure specified a priori, provides one way to analyse such longitudinal data. However, few studies using SLCMA consider the effect of time-varying covariates (TVC) which may impact associations. Methods: We present a modified version of the SLCMA - called direct and mediated effects (DME)-SLCMA - which corrects for TVC. We first develop the DME-SLCMA method, test it through simulation, and apply it to psychosocial data from the Drakenstein Child Health Study (DCHS, n=336) to investigate relationships between maternal psychopathology, TVC of socioeconomic status, and offspring depressive symptoms. Results: We found that, on average, offspring depressive symptoms score increased by 3.9% (95% CI: 1.0%-6.9%, p = 0.039) for each unit of maternal psychopathology (SRQ) at 48 months whilst adjusting for time-varying socioeconomic status (at 18, 30, 42 and 54 months). Our simulations identified several realistic scenarios where selections ignoring TVC - with TVC mediated exposure effects present - were prone to be incorrect, including our DCHS example. Conclusion: DME-SLCMA is a robust new approach for life course modelling in the presence of time-varying covariates. We recommend adjusting for TVC whenever possible, and, when not possible, our simulation study identified that scenarios where mediated effects are comparable, or greater, in magnitude to direct effects are most prone to confounding.

3

Estimating COVID-19 Cumulative Incidence from Seroprevalence Surveys accounting for Time-Varying Seroreversion: A Fully Bayesian Methodology

Owusu-Boaitey, N.; Meyer, M. J.; Herrera-Esposito, D.; Bottcher, L.; Lukz, M.; Cook, S.; Stoto, M. A.; Kraemer, J. D.

2026-06-10 epidemiology 10.64898/2026.06.09.26355264 medRxiv

Top 0.1%

33.0%

Seroprevalence surveys reveal the extent of humoral immunity against pathogens such as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and under some circumstances represent cumulative incidence of prior infection. However, antibody waning - or seroreversion - biases these estimates by reducing assay sensitivity in a time-varying manner. Because assay sensitivity decays over time, naively using serosurveys can substantially bias estimates of SARS-CoV-2 cumulative incidence and fatality rates. The Bayesian assay-specific, time-varying sensitivity adjustment developed in this paper can reliably correct for this bias and account for the delay between infection and serosurvey. In seroprevalence studies conducted in the United States in 2020, adjusting for time-varying sensitivity increased cumulative incidence by up to 1.4-fold, with an adjustment of 1.08 for a national study. Our estimates contrast with a previously published 2-fold adjustment that did not account for assay design. This suggests that previous analyses overestimated cumulative incidence by applying seroreversion corrections that did not account for assay-specific effects, or underestimated cumulative incidence by not applying seroreversion corrections. These biases imply fatality rate underestimation and overestimation, respectively. Our model provides a framework for design-specific time-varying sensitivity corrections in seroprevalence surveys for other pathogens.

4

Disentangling infectiousness and susceptibility by age group using transmission pair data: a study of SARS-CoV-2 household transmission

Leung, K. Y.; Miura, F.; Backer, J. A.

2026-06-05 epidemiology 10.64898/2026.06.04.26354892 medRxiv

Top 0.1%

21.5%

Background Differential contributions to transmission across age groups have been reported for many respiratory infections, including SARS-CoV-2. They are crucial for estimating the impact of age-specific interventions. Disentangling these age-dependent contributions remains challenging, as they may reflect differences in contact rates, biological susceptibility, or infectiousness. Aim We aim to jointly estimate age-specific per-contact infectiousness and susceptibility and their effect on the impact of age-specific interventions. Methods The age-specific infectiousness and susceptibility were jointly estimated in a Bayesian framework by combining contact data with transmission pair data (who-infected-whom). We applied this approach to 197,840 self-reported household transmission pairs collected in the Netherlands during the COVID-19 pandemic. Using these estimates, we projected the expected impact of school closure and work-from-home measures during the early stages of an epidemic in the absence of other interventions. Results Both infectiousness and susceptibility to SARS-CoV-2 infection were lowest in children aged 0-9 years and highest in adults over 30 years old, with 2- to 4.5-fold differences between these groups. Projected impacts of age-specific interventions indicated that school closures would reduce the reproduction number by 8% or 29% when age-specific susceptibility and infectiousness were or were not considered, respectively. Conversely, working-from-home policies would lead to reductions of 41% with and 20% without age-specific infectiousness and susceptibility. Conclusion Our method enables robust estimation of age-specific infectiousness and susceptibility. Accounting for these age heterogeneities is essential for projecting the impact of age-targeted interventions. Our approach is adaptable to other respiratory infections and can guide more tailored public health responses.

5

A New Mixed Frequency Regression Model For Environmental Epidemiology

Shukla, N.; Bartington, S. E.; Hansell, A. L.; Lucas, T. C.

2026-06-04 epidemiology 10.64898/2026.06.03.26354801 medRxiv

Top 0.1%

12.4%

Background: In the absence of high-resolution response data, exposure-response modelling often relies on aggregated low-frequency exposure data, leading to loss of high-resolution information. Mixed Data Sampling (MIDAS) from econometrics offers an alternative but is limited due to its inability to make high-resolution predictions, inflexible likelihoods and penalised nonlinear functions, and limited visualization options. We propose a mixed-frequency Distributed Lag Non-linear Model (mf-DLNM) which can eliminate the need to aggregate exposure data in environmental epidemiology and provide high resolution predictions for time series studies. Methods: We evaluated the inference and predictive performance of the mf-DLNM. To evaluate its ability to estimate exposure-response relationships, we applied mf-DLNM and same-frequency (sf)-DLNM using data from the West Midlands, UK. Additionally, we compared the predictive performance of mf-DLNM with sf-DLNM and MIDAS across nine regions of England. As MIDAS cannot predict at the resolution of the predictor (daily), we compared the predictive performance of mf-DLNM and MIDAS at weekly resolution. To test the model's ability to predict high temporal resolution risk (daily), we compared sf-DLNM (with access to daily mortality counts) with mf-DLNM (with access only to weekly mortality counts). Results: In the West Midlands example, mf-DLNM performed comparably to sf-DLNM in estimating daily risk of temperature on respiratory mortality. Furthermore, mf-DLNM and MIDAS exhibited similar performance for weekly predictions. For high-resolution predictions, mf-DLNM and sf-DLNM showed nearly similar performance, despite mf-DLNM having access only to low-resolution response data. 
Conclusion: This mixed-frequency approach in environmental epidemiology overcomes the limitations of predicting health risks using aggregated exposure data and provides estimates of high-resolution outcomes in the absence of high-frequency health outcome datasets.

6

Modeling the Impact of Pediatric RSV Immunization in Massachusetts, 2024-2025

Jones, L.; Ergas, R.; Tibbs, A.; Russo, E. T.; Norville, J.; Bingay, B.; Brown, C. M.; Reich, N. G.; Pasco, R.

2026-06-10 epidemiology 10.64898/2026.06.05.26354236 medRxiv

Top 0.1%

8.4%

Background Pediatric immunizations for Respiratory Syncytial Virus (RSV), including monoclonal antibodies for infants and vaccines for pregnant people, have become broadly available and can prevent severe RSV outcomes in infants. However, quantifying the impact of RSV immunization in prevention of severe pediatric illness at the population-level is limited by lack of RSV case surveillance data. The Massachusetts Department of Public Health (DPH) conducted a modeling analysis using routine public health surveillance data to estimate the state-level impact of new RSV immunization products on Emergency Department (ED) visits and hospitalizations in Massachusetts for highest risk pediatric groups. Methods A scenario projection tool, called R.Scenario.Vax, was utilized to simulate RSV-associated ED hospital encounters by age group in the context of newly available immunizations. ED visit and hospitalization data from the National Syndromic Surveillance Program (NSSP) during the time period 10/08/2017-10/19/2024 were analyzed, scaled to account for changes in RSV testing practices over time and missing encounter volume in historic data, and utilized to inform model fit of a "typical" RSV season. RSV immunization data from the Massachusetts Immunization Information System (MIIS) for the 2023-2024 and 2024-2025 RSV seasons informed high and moderate pediatric RSV immunization coverage scenarios and their impact was compared to a counterfactual reference scenario of no new immunizations. Median projections were quantitatively and qualitatively compared to observed 2024-2025 season data. Percent reduction in hospital encounters and encounters averted per 10,000 population were calculated for each scenario as compared to the reference. Results Projections for the youngest at-risk age groups showed significantly lower RSV-associated ED visits and hospitalizations during the 2024-2025 season for both high and moderate immunization coverage scenarios. 
Median projections for infants under 6 months old in the highest coverage scenario, wherein nearly all infants were immunized, showed 72.6% lower ED visits and 73.4% lower hospitalizations when compared to the reference scenario, equating to 262 ED visits and 85 hospitalizations averted per 10,000 population. Conclusions Our results support the use of modeling methods for public health insights and suggest that RSV immunizations for infant populations result in significantly lower RSV-related ED encounters in Massachusetts.

7

Serum Cotinine and Wrist-Worn Ambient Light Exposure Patterns in U.S. Adults: A Cross-Sectional Analysis of NHANES 2011-2014

Wong, A.; Lee, C. W.; Park, A.; Yin, L.; Choi, Y.

2026-06-04 epidemiology 10.64898/2026.06.02.26354759 medRxiv

Top 0.1%

7.5%

Background. Tobacco smoke exposure, quantified by serum cotinine, is associated with cardiovascular, metabolic, and sleep-related health risks. The relationship between biomarker-verified tobacco smoke exposure and objectively measured, free-living wrist-worn ambient light patterns has not been examined in a nationally representative U.S. adult sample. Methods. We analyzed NHANES 2011-2014 cross-sectional data from 6,937 adults aged >20 years with valid serum cotinine and wrist-worn Physical Activity Monitor (PAM) ambient light data. Seven light outcomes were modeled using survey-weighted linear regression with log2(cotinine+1) as the continuous exposure across four covariate adjustment levels. Benjamini-Hochberg false discovery rate (FDR) correction was applied across the 7 outcomes within each model. Results. In Model 2 (adjusted for age, sex, race/ethnicity, education, poverty-income ratio, BMI, and survey cycle; N = 6,350), higher serum cotinine was associated with significantly higher nighttime light (beta = +0.024, 95% CI: 0.010, 0.038; p-FDR = 0.014) and lower evening light (beta = -0.031, 95% CI: -0.055, -0.008; p-FDR = 0.042). In exploratory behavioral models without alcohol (Model 3a; N = 5,766), both nighttime and evening associations remained FDR-significant. After additional adjustment for alcohol, which substantially reduced the sample due to 37.6% missingness (Model 3b; N = 3,866), the nighttime association attenuated below the FDR threshold, while the evening association remained FDR-significant. Categorical analyses showed progressively higher nighttime light across cotinine groups, and a hypothesis-generating sex interaction was identified (p-interaction = 0.001). Conclusions. Higher serum cotinine concentrations were associated with higher nighttime and lower evening ambient light after sociodemographic adjustment. Attenuation after behavioral adjustment and the cross-sectional design preclude causal inference. 
Longitudinal studies with formal mediation analyses are needed to clarify the temporal ordering and mechanisms linking tobacco smoke exposure, smoking-related behaviors, and personal light-dark cycle patterns.
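
The abstract above applies Benjamini-Hochberg FDR correction across seven outcomes within each model. As a rough illustration of that step only, here is a minimal pure-Python sketch of the adjustment. The function name and the raw p-values are hypothetical and are not taken from the study; the authors' actual software is not stated in the abstract.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for seven outcomes (illustrative, not study data).
raw = [0.001, 0.008, 0.04, 0.20, 0.41, 0.55, 0.90]
adj = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in adj]
```

With these made-up inputs, only the first two outcomes stay below a 0.05 FDR threshold after adjustment, even though three raw p-values are below 0.05.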

8

Universal Periodic Review recommendations and trajectories of maternal health between 2005 and 2023: a longitudinal ecological analysis of 89 countries

Uppal, A.; Thomas, R.; De Pasquale, M.; Sillo, J.; Getahun, H.

2026-06-05 public and global health 10.64898/2026.06.03.26354800 medRxiv

Top 0.2%

4.0%

Background: The Universal Periodic Review (UPR) is a peer-review mechanism established to hold UN Member States accountable for human rights including the right to health, yet evidence on its impact on health outcomes is limited. We evaluated whether UPR engagement is associated with accelerated improvements in maternal health trajectories. Methods and Findings: We conducted a longitudinal ecological analysis of 89 countries with a baseline maternal mortality ratio (MMR) of 70 or greater per 100,000 live births in 2005. Outcomes were trajectories of annual MMR, skilled birth attendance (SBA), and contraceptive prevalence rate (CPR), from 2005 to 2023. The exposure was the volume of health-related UPR recommendations received across three cycles, thematically classified using a validated rule-based algorithm. Mixed-effects models adjusted for time-varying GDP per capita and historical fragility. The 89 countries received 41,733 UPR recommendations across three cycles, of which 405 (1%) were related to maternal health. Maternal health recommendations were preferentially directed at countries with higher baseline MMR and lower SBA. After adjustment, each additional maternal health recommendation was associated with a 0.24% [95% confidence interval (CI): 0.08, 0.40] faster annual reduction in MMR, a 0.52% [0.12, 0.91] faster annual gain in the odds of SBA, and a 0.21% [0.09, 0.34] faster annual gain in the odds of CPR. Broader recommendations on women's health and health systems and services were also associated with faster annual improvements in trajectories across all three outcomes; recommendations on abortion, family planning, sexual health and wellbeing, and sexual education tended to be directed towards lower-burden countries and were not associated with differences in any trajectories. It is important to note that the ecological design precludes causal inference. 
Conclusions: Receiving UPR recommendations on the themes of maternal health, women's health, and health systems and services is associated with accelerated improvements in maternal health trajectories among high-burden countries. These findings suggest that international human rights accountability mechanisms may have a role in supporting national progress on maternal health.

9

Integrated cardiometabolic and nutritional risk profiling identifies pregnancy loss as a marker of systemic metabolic vulnerability

Agarwal, T.; Namburu, J. R.; Kachroo, P.

2026-06-08 epidemiology 10.64898/2026.06.04.26354910 medRxiv

Top 0.3%

3.7%

Background: Pregnancy loss has important implications for women's health. Although maternal age is a well-established risk factor, the contribution of routinely measured cardiometabolic and behavioral markers at population-scale remains incompletely characterized. Objective: To examine associations between cardiometabolic, nutritional, and behavioral risk markers and pregnancy loss among U.S. women of reproductive age. Methods: We conducted a cross-sectional analysis of 4,842 U.S. women aged 20-44 years with ≥1 pregnancy using the National Health and Nutrition Examination Survey data (2013-2023). Pregnancy loss was defined as ≥1 prior miscarriages. Exposures included body mass index, smoking exposure (cotinine), lipid biomarkers, vitamin D and folate, and a composite cardiometabolic-nutritional risk score. Survey-weighted logistic regression estimated adjusted odds ratios (aORs) and 95% confidence intervals, with bootstrap resampling for predictor robustness. Results: The weighted prevalence of pregnancy loss was 23%. Higher odds of pregnancy loss were associated with increasing age (aOR per year=1.02; 95% CI: 1.00-1.04), Non-Hispanic Black race (aOR=1.32; 95% CI: 1.00-1.74), overweight (aOR=1.56; 95% CI: 1.16-2.11), obesity (aOR=2.06; 95% CI: 1.39-3.05), and smoking (aOR=1.58; 95% CI: 1.19-2.10). Adverse lipid profiles, particularly elevated triglycerides (aOR=1.83; 95% CI: 1.16-2.90) and high low-density lipoprotein (aOR=2.97; 95% CI: 1.45-6.61), were independently associated with pregnancy loss. Vitamin D/folate were not stable predictors. Higher composite cardiometabolic-nutritional risk scores were observed among women with pregnancy loss (P=0.026). Conclusion: Pregnancy loss clustered with adverse cardiometabolic and behavioral risk markers in a nationally representative population. 
These findings highlight pregnancy loss as a marker of broader metabolic vulnerability supporting the need for longitudinal studies and cardiometabolic profiling to inform preconception care and risk stratification.

10

A wealth index based on two-component polychoric principal component analysis reduces urban bias and improves socioeconomic classification in low- and middle-income country surveys: a validation study using LSMS surveys

Vidaletti, L. P.; Dos Santos, A. M.; Hellwig, F.; Barros, A. J. D.

2026-06-08 epidemiology 10.64898/2026.06.01.26354245 medRxiv

Top 0.5%

2.4%

Background: The traditional wealth index, based on principal component analysis (PCA), used in the Demographic and Health Surveys (DHS) and Multiple Indicator Cluster Surveys (MICS), suffers from urban bias, distorting estimates of health inequality. We compared the traditional index (PEAR1) with an alternative two-component polychoric PCA index (POLY2) using annual expenditure from 12 LSMS surveys as the gold standard to determine which provides more accurate SEP measures for equitable policy targeting. Methods: We compared the traditional wealth index (PEAR1) with a two-component polychoric PCA approach (POLY2) using 12 LSMS (Living Standards Measurement Study) surveys (2015-2022) from 12 African countries. Annual household consumption expenditure was the gold standard. We assessed agreement using weighted Cohen's kappa and validated against education (proportion of households with secondary or higher education) using the concentration index (CIX) and slope index of inequality (SII). Results: The POLY2 index showed higher agreement with expenditure quintiles (average national weighted kappa = 43.3%) than the PEAR1 index (35.1%), with notable improvements in urban (43.5% vs. 27.5%) and rural (35.3% vs. 22.4%) areas. POLY2 also attenuated extreme household distributions observed in PEAR1. Education validation showed that POLY2 produced intermediate inequality gradients between the flatter expenditure-based gradient and the steeper PEAR1-based gradient. Conclusion: The POLY2 wealth index is superior to the traditional index, reducing urban-rural bias and providing more accurate socioeconomic classifications. Its adoption in large-scale surveys such as DHS and MICS is recommended to improve equitable monitoring of health inequalities in low- and middle-income countries.
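
The agreement statistic in the abstract above is weighted Cohen's kappa between wealth-index quintiles and expenditure quintiles. The sketch below shows how such a statistic can be computed. The weighting scheme is an assumption: the abstract does not say whether linear or quadratic weights were used, so the hypothetical `power` parameter exposes both. The quintile codes are invented for illustration.

```python
def weighted_kappa(a, b, n_categories, power=1):
    """Weighted Cohen's kappa for ratings coded 0..n_categories-1.

    power=1 uses linear disagreement weights, power=2 quadratic ones.
    """
    n = len(a)
    k = n_categories
    # Observed joint proportions of (rating a, rating b).
    observed = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[x][y] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power
            num += w * observed[i][j]
            den += w * row[i] * col[j]
    return 1.0 - num / den


# Hypothetical quintile assignments (0 = poorest, 4 = richest).
expenditure_q = [0, 1, 2, 3, 4]
index_q_same = [0, 1, 2, 3, 4]
index_q_reversed = [4, 3, 2, 1, 0]

kappa_perfect = weighted_kappa(expenditure_q, index_q_same, 5)
kappa_reversed = weighted_kappa(expenditure_q, index_q_reversed, 5)
```

Identical classifications give a kappa of 1, while the fully reversed ordering gives a negative value, which indicates agreement worse than chance.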

11

Early life multidimensional disadvantage of South Australian children: a whole-population linked data study

Kalamkarian, A.; Pilkington, R. M.; Lynch, J.; Mittinty, M. N.; Malvaso, C.; Hawkins, K.; Pharo, H.; Beck, K.; Chittleborough, C. R.

2026-06-05 epidemiology 10.64898/2026.06.03.26354860 medRxiv

Top 0.5%

2.3%

Background: Whole-population linked administrative data platforms provide an opportunity to generate evidence on early life multidimensional disadvantage to inform resourcing and service provision to families with complex needs. Methods: We used individual-level de-identified data from nine administrative data sources included in the Better Evidence Better Outcomes Linked Data (BEBOLD) platform. The population included all children born in South Australia between 2004-2011 (n=143,083), and their parents. We described the prevalence and distribution of multiple disadvantages affecting children from the 12 months before birth to age 5. Eleven domains of parental disadvantage were created: economic, education, access to services, mental health, substance misuse, smoking during pregnancy, domestic and family violence, health, child protection contact, justice system contact, and death. We investigated the concordance of our measure with an area-level socioeconomic measure used in government reporting. Results: One in two children (48%) were exposed to at least one disadvantage domain, and one in seven (14%) were exposed to three or more domains before age five. Economic disadvantage was most prevalent, affecting one in four (27%) children, of which 75% were exposed to additional forms of disadvantage. Substance misuse, domestic and family violence, and justice system contact were the least likely domains to occur in isolation. Only 54.4% who experienced five or more disadvantage domains were classified in the area-level socioeconomic measure's 'most disadvantaged' quintile. Conclusion: Early life exposure to parental disadvantage can be highly multidimensional. Measurement across different systems is important for informing coordinated service provision for families with complex needs.

12

Ultra-low-field MRI as a tool for measuring brain development in at-risk children in LMICs: feasibility, validity and clinical relevance.

Bradford, L. E.; Ringshaw, J. E.; Malaba, T. R.; Bourke, N. J.; Wedderburn, C. J.; Williams, S. C.; Deoni, S.; Reynolds, H.; Read, J.; Read, L.; Waitt, C.; Mrubata, M.; Stemmet, L.-A.; Davel, L.; Colbers, A.; Wang, D.; Khoo, S.; Myer, L.; Donald, K. A.

2026-06-05 hiv aids 10.64898/2026.06.02.26354785 medRxiv

Top 0.6%

2.0%

Background Children in low- and middle-income countries (LMICs) face an elevated risk of developmental delay, yet scalable neuroimaging tools to study early brain development in these contexts remain limited. Children who are HIV-exposed but uninfected (CHEU) represent a growing population with evidence of language and motor delays and altered brain development compared with children who are HIV-unexposed (CHU). Ultra-low-field (ULF) MRI offers a more affordable alternative to conventional high-field (HF) MRI, but its application in early childhood remains underexplored. Methods We compared brain volumes derived from ULF (64mT) and HF (3T) MRI in South African CHEU and CHU as part of the DolPHIN-2 PLUS study. Volumetric segmentation was performed using FreeSurfer v7.4.1 and SynthSeg on the Flywheel platform. Agreement between modalities was assessed using Pearson's and Lin's concordance correlation coefficients across global and subcortical regions. Associations between ULF-derived brain volumes and developmental outcomes, measured by the Bayley Scales of Infant Development, Third Edition, were evaluated using partial correlations adjusted for sex and age. Results Forty-five children (9 CHEU, 36 CHU; mean age 45.6 months) had paired ULF and HF scans of usable quality. Strong correlations were observed between ULF and HF volumes for global white and grey matter regions (r > 0.92) and larger subcortical grey matter structures such as the thalamus, caudate, and putamen (r = 0.86-0.89). Moderate-to-weak correlations were evident in smaller structures (hippocampus, pallidum, amygdala). ULF underestimated most grey matter volumes, and overestimated total white matter volume relative to HF. ULF-derived global and subcortical volumes were associated with receptive and expressive communication (r = 0.34-0.59, all p < 0.05). Conclusions ULF MRI produces brain volume estimates comparable to HF MRI and captures meaningful associations with early language development. 
These findings support ULF MRI as a feasible and scalable tool for studying neurodevelopment in vulnerable paediatric populations in LMICs.
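
The abstract above reports both Pearson's correlation and Lin's concordance correlation coefficient (CCC). The distinction matters when one modality is systematically offset from the other. The sketch below uses made-up volumes (not study data) and hypothetical function names to show that a constant shift leaves Pearson's r at 1 while lowering the CCC.

```python
import math


def _moments(x, y):
    """Means, population variances and covariance of paired measurements."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x) / n
    syy = sum((v - my) ** 2 for v in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, sxx, syy, sxy


def pearson_r(x, y):
    """Pearson correlation: strength of linear association only."""
    _, _, sxx, syy, sxy = _moments(x, y)
    return sxy / math.sqrt(sxx * syy)


def lins_ccc(x, y):
    """Lin's CCC: penalizes departures from the line of identity y = x."""
    mx, my, sxx, syy, sxy = _moments(x, y)
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)


# Hypothetical paired volumes: the second method reads 2 units higher throughout.
method_a = [1.0, 2.0, 3.0, 4.0, 5.0]
method_b = [3.0, 4.0, 5.0, 6.0, 7.0]

r = pearson_r(method_a, method_b)
ccc = lins_ccc(method_a, method_b)
```

In this toy case the two methods are perfectly correlated yet far from interchangeable, which is why agreement studies often report both statistics.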

13

Local Influenza Forecasts Outperform State-Level Forecasts in the United States

Kim, D.; Pasco, R.; Johnson, K. E.; Fox, S. J.; Reich, N. G.; Meyers, L. A.

2026-06-08 infectious diseases 10.64898/2026.06.04.26354836 medRxiv

Top 0.6%

1.9%

Accurate outbreak forecasts are critical for timely and effective public health response. In the United States, however, most forecasts are produced at the state level, which can mask substantial sub-state heterogeneity and limit their utility for local planning. We generated and evaluated forecasts of the percentage of Emergency Department visits attributable to influenza across 173 large metropolitan Health Service Areas (HSAs) using a gradient boosting quantile regression (GBQR) model, and compared their accuracy to forecasts derived from state-level data alone. At a one-week, two-week and three-week horizon, local forecasts outperformed state-based forecasts in 98.8%, 90.8%, and 78.6% of HSAs, respectively, achieving mean weighted interval scores that were on average 39.2% lower (95% range: 5.9% to 76.7%), 19.6% lower (-6.3% to 59.5%), and 11.4% lower (-11.7% to 44.9%), respectively. The performance advantage of local forecasting was strongest in HSAs representing a smaller share of their state's population and increased with the proportion of the HSA population living in urban areas and the number of metropolitan areas within a state. These results, based on an analysis of HSAs with populations greater than 250,000, demonstrate that fine-scale modeling can substantially improve forecast accuracy and highlight the potential value of local forecasts for outbreak preparedness and response.
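
The comparison above rests on the weighted interval score (WIS), a standard score for quantile forecasts in which lower is better. A minimal sketch of the commonly used definition follows. The interval levels, the forecast values, and the function names are illustrative assumptions; the abstract does not state which quantile levels the authors scored.

```python
def interval_score(lower, upper, y, alpha):
    """Score of a central (1 - alpha) prediction interval for observation y."""
    score = upper - lower  # width penalty
    if y < lower:
        score += (2.0 / alpha) * (lower - y)  # penalty for overshooting y
    if y > upper:
        score += (2.0 / alpha) * (y - upper)  # penalty for undershooting y
    return score


def weighted_interval_score(median, intervals, y):
    """WIS from a median and a dict mapping alpha -> (lower, upper)."""
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)


# Hypothetical forecast of percent ED visits due to influenza (not study data).
forecast_median = 3.0
forecast_intervals = {0.5: (2.5, 3.5), 0.1: (1.5, 4.5)}
observed = 4.0

wis = weighted_interval_score(forecast_median, forecast_intervals, observed)
```

A statement such as "39.2% lower" then corresponds to a relative reduction of the form 1 - WIS_local / WIS_state, averaged over locations and forecast dates.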

14

Mortality in people with attention-deficit/hyperactivity disorder (ADHD): Examining how risk is embodied in a pooling of two prospective cohort studies

Li, H.; Ford, T.; Warrier, V.; Bell, S.; Batty, G. D.

2026-06-09 epidemiology 10.64898/2026.06.08.26355148 medRxiv

Top 0.7%

1.7%

Background. Nascent findings suggest that people with attention-deficit/hyperactivity disorder (ADHD) experience higher rates of mortality. To date, study samples have been insufficiently well-characterized to examine the mechanisms via which this neurodevelopmental condition elevates mortality risk. Methods. We used data from the 2007 and 2011 waves of the US National Health Interview Survey, a general population-based cohort study comprising 52097 adults (28675 women) aged 18 years or older at baseline. ADHD diagnosis and an array of demographic, socioeconomic, lifestyle, and co-morbidity (somatic and psychiatric) covariates were self-reported. Findings. At baseline, compared with unaffected individuals, participants with ADHD were more likely to be socioeconomically disadvantaged, smoke cigarettes, consume alcohol, and report symptoms of psychological distress. A median 7.75 years of mortality surveillance (range: 7.25-12.25) gave rise to 6597 deaths from all causes. After adjustment for age, sex, ethnicity, and survey year, ADHD was associated with a markedly elevated risk of death (hazard ratio [95% confidence interval]: 1.58 [1.20-2.09]). Statistical adjustment for socioeconomic circumstances (11% attenuation), physical co-morbidities (15%), and lifestyle factors (17%) had only a modest impact on the ADHD-death gradient, with the greatest explanatory power apparent for symptoms of depression and anxiety (58%). The magnitude of the association of ADHD with mortality was commensurate to that for several well-established risk factors such as poverty (1.66 [1.55-1.78]), hypertension (1.41 [1.32-1.51]), and diabetes (1.71 [1.59-1.85]) but somewhat lower than cigarette smoking (2.51 [2.29-2.76]) after controlling for age, sex, ethnicity, and survey year. Associations between ADHD and cause-specific mortality from cardiovascular disease, cancer, and chronic respiratory disease were inconclusive. Interpretation. 
In the present study, the influence of ADHD on total mortality appears to be largely embodied via a series of malleable characteristics, particularly mental illness. If confirmed elsewhere, these results raise the possibility that risk factor modification via standard pharmacological and behavioral interventions could help reduce rates of premature mortality in this patient group. Funding. This paper received no direct funding. GDB is supported by the UK Medical Research Council (MR/P023444/1) and the US National Institute on Aging (1R56AG052519-01, 1R01AG052519-01A1).
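The abstract reports attenuation percentages without defining them. One common convention, assumed here and not confirmed by the source, expresses attenuation as the share of the excess hazard (HR minus 1) that disappears after further adjustment. The function name and the example values below are illustrative only.

```python
def percent_attenuation(hr_basic: float, hr_adjusted: float) -> float:
    """Percent of the excess hazard (HR - 1) removed by further adjustment.

    Assumed convention: 100 * (HR_basic - HR_adjusted) / (HR_basic - 1).
    """
    if hr_basic == 1.0:
        raise ValueError("no excess hazard to attenuate when the basic HR equals 1")
    return 100.0 * (hr_basic - hr_adjusted) / (hr_basic - 1.0)


# Hypothetical hazard ratios, not taken from the study.
print(percent_attenuation(2.0, 1.5))  # 50.0
```

Other conventions exist, for example computing the change on the log hazard ratio scale, and they can give noticeably different percentages for the same pair of estimates.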

15

Within-household transmission risk of pulmonary tuberculosis in the era of universal antiretroviral therapy

Khan, P. Y.; Govender, I.; McCreesh, N.; Sithole, M.; Mkwanzai, E.; Sweeney, S.; Ording-Jespersen, G.; Wong, E. B.; Hanekom, W.; Houben, R. M. G. J.; White, R. G. M. G. J.; Smit, T.; Smith, M. J.; Fielding, K.; Grant, A. D.

2026-06-09 epidemiology 10.64898/2026.06.01.26354571 medRxiv

Top 0.8%

1.5%

Background. Tuberculosis remains the leading infectious cause of death worldwide. In the WHO African region, declining incidence has coincided with antiretroviral therapy (ART) scale-up, though whether this reflects reduced progression to disease or reduced transmission is unclear. We evaluated how ART and symptom status influence within-household Mycobacterium tuberculosis complex (MTBC) transmission risk. Methods. We conducted a case-contact household study in rural South Africa, enrolling index adults with bacteriologically-confirmed pulmonary tuberculosis. MTBC immunoreactivity was measured in all child household contacts (aged 2-14 years) as a proxy measure of within-household transmission. We assessed the influence of index person ART status and symptom status, and explored effect-measure modification of the association between index person HIV status and transmission risk by sex. Results. Among 755 child contacts of 296 index persons, effective ART was not associated with within-household MTBC transmission risk (risk ratio [RR], 1.07; 95% CI, 0.66-1.74). Among PLHIV engaged in ART care, WHO TB four-symptom screen (WHO4SS) status was not associated with transmission risk (RR, 0.80; 95% CI, 0.43-1.47), although absence of reported cough reduced risk (RR, 0.61; 95% CI, 0.38-0.96). A pronounced interaction between sex and HIV status was observed: HIV-negative women had the highest within-household MTBC transmission risk (30.5% vs. 14.3% in women with HIV) whereas risks were similar between HIV-positive and HIV-negative men. Conclusions. We found no evidence that effective ART or WHO4SS status influenced within-household MTBC transmission risk, though confidence intervals were wide. Absence of reported cough was associated with lower risk, and transmission risk was highest among child contacts of HIV-negative women. These findings suggest reported cough is a useful marker of transmission risk and that routine tuberculosis screening within ART care may reduce transmission from PLHIV; intensified efforts are nonetheless needed to achieve earlier tuberculosis detection in HIV-negative individuals.

16

A Decade of the Center for Disease Control and Prevention's FluSight Influenza Forecasting

Hines, A. G.; Mathis, S. M.; Johansson, M. A.; Biggerstaff, M.; Reed, C.; Borchering, R.

2026-06-08 epidemiology 10.64898/2026.06.05.26354941 medRxiv

Top 0.9%

1.3%

Since the U.S. 2013/14 influenza season, the CDC's FluSight Challenge has provided a platform for evaluating influenza forecasting models and fostering collaboration across institutions. The Challenge aims to improve the science and enhance the utility of infectious disease forecasts for public health decision making. We analyzed ten years of submitted forecasts (influenza-like illness seasons 2014/15-2019/20 and hospital admissions seasons 2021/22-2024/25) across a range of model types, including statistical, mechanistic, machine learning, and hybrid models. Influenza-like illness (ILI) forecasts were evaluated using the exponentiated logarithmic score (skill metric), while hospital admissions forecasts were evaluated using the log-transformed relative Weighted Interval Score. Performance differences were assessed using Wilcoxon rank-sum tests, and associations with team participation history were evaluated using Spearman's rank correlation. Model performance varied by season, and no single model type consistently outperformed others. In ILI seasons, statistical models generally performed better than mechanistic and machine learning models, though consistent differences were not observed in more recent hospital admissions seasons. Ensemble forecasts showed better overall performance across seasons, and the CDC's FluSight ensemble ranked among the top-performing forecasts every year. We also found a positive correlation between forecast accuracy and the number of years a team participated in the Challenge, with statistically significant associations in four seasons. These findings highlight the benefits of ensemble approaches and sustained engagement in improving forecasting performance, while also underscoring the continued value of forecast evaluation before and following the COVID-19 pandemic. Insights from the FluSight Challenge can guide future infectious disease forecasting efforts and support more effective public health preparedness.
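The abstract names the Weighted Interval Score (WIS) but does not spell it out. The sketch below follows the standard definition of the WIS for a forecast given as a median plus central prediction intervals, and then takes the log of the ratio to a baseline score. How the study aggregated scores across dates, locations, and horizons is not stated, so the function names and the single-forecast ratio here are assumptions for illustration.

```python
import math


def interval_score(lower: float, upper: float, y: float, alpha: float) -> float:
    """Interval score of a central (1 - alpha) prediction interval for observation y."""
    width = upper - lower
    penalty_below = (2.0 / alpha) * max(lower - y, 0.0)  # observation under the interval
    penalty_above = (2.0 / alpha) * max(y - upper, 0.0)  # observation over the interval
    return width + penalty_below + penalty_above


def weighted_interval_score(median: float, intervals: dict, y: float) -> float:
    """WIS from a median and a dict mapping alpha -> (lower, upper)."""
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)


def log_relative_wis(wis_model: float, wis_baseline: float) -> float:
    """Log of the model's WIS relative to a baseline; negative means better than baseline."""
    return math.log(wis_model / wis_baseline)


# Hypothetical forecast: median 10 with a 50% interval of (8, 12).
print(weighted_interval_score(10.0, {0.5: (8.0, 12.0)}, y=14.0))
```

Lower WIS is better, so a log relative WIS below zero indicates a forecast that outperformed the chosen baseline.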

17

Spatial and temporal associations between animal ownership and malaria prevalence in Africa using cross-sectional national Demographic and Health Surveys

Topazian, H. M.; Morgan, C. E.; Goel, V.

2026-06-08 epidemiology 10.64898/2026.06.05.26355017 medRxiv

Top 1.0%

1.2%

Use of zooprophylaxis as a malaria control strategy has been recommended historically, but a complex relationship exists between animal ownership and malaria infection, with mixed associations described in the literature. We sought to characterize this relationship spatially and temporally in malaria-endemic regions of Africa. We used data from 392,843 individuals from 66 Demographic and Health surveys from countries within Africa to investigate the association between household animal ownership and Plasmodium infection. We used Bayesian models with Integrated Nested Laplace Approximation to incorporate spatially varying coefficient processes, allowing the association of interest to vary over space, time, and within strata of vector species occurrence, land cover, and number of animals owned by households. Spatially varying intercept models showed that ownership of cattle, chickens/poultry, goats, horses/donkeys/mules, pigs, and sheep was broadly associated with malaria infection, with odds ratios ranging from 1.55 to 1.67. However, spatially varying slope models revealed considerable heterogeneity, with odds ratio estimates for all animal types demonstrating both protective and harmful effects varying from 0.33 to 3.33 both subnationally and across time. We found no evidence that modification by vector species, number of animals owned, and land cover fully explained the variation in estimates. Unobserved localized cultural, behavioral, or ecological factors likely modify the association between animal ownership and malaria prevalence. Further exploring the nature of this relationship over space and time will be important to understanding how context-specific One Health dynamics between humans, animals and the environment affect malaria prevention and control efforts.

18

Estimating Infectious Disease Importation Risk during the 2026 FIFA World Cup

Herrera-Diestra, J. L.; Bi, K.; Ptak, S.; Ertem, Z.; Al-amery, A.; Harris, M.; Meyers, L. A.

2026-06-04 public and global health 10.64898/2026.06.03.26354828 medRxiv

Top 1%

1.2%

Background. The 2026 FIFA World Cup will bring an estimated 1-5 million international visitors to 11 US host cities between June 11 and July 19, 2026, making it the largest tournament in history. Large-scale international gatherings accelerate importation of infectious diseases from diverse source populations. Advance estimation of importation risk is essential for public health preparedness and surveillance prioritization. Methods. We developed a Poisson importation framework applied to five diseases (dengue fever, influenza, malaria, measles, and pertussis) across the 11 US venue cities. Three nested travel models of increasing resolution were constructed: a baseline model using routine June 2024 arrival data; a World Cup-adjusted model incorporating projected visitor growth factors; and a schedule-driven model routing WC fans to specific cities based on match assignments. WHO incidence and BTS T-100 routing fractions were combined with Monte Carlo uncertainty propagation (5,000 Uniform draws on under-reporting and travel-while-infectious parameters) to yield median importation estimates with 95% uncertainty intervals. Results. Dengue posed the highest importation risk at most venue cities under the schedule-driven model (median Λ > 10 expected importations from Brazil alone; 95% uncertainty interval 5.9-33.1), robust across the full literature-supported parameter range; Atlanta was the exception, where malaria probability exceeded dengue, driven by direct travel from West and Central African nations. Influenza ranked second at most cities, coinciding with the Southern Hemisphere winter peak. Pertussis showed broad geographic spread but carries the widest relative uncertainty, as the assumed detection rate sits at the upper bound of the literature range. Background tourism accounted for the dominant share of total importation risk; the World Cup fan increment contributed approximately 8.3% of projected arrivals for WC-qualified nations. Conclusions. This Poisson importation framework, built entirely from publicly available data, provides reproducible importation risk estimates for mass gathering events. The framework extends to additional diseases, cities, and gatherings, offering a transparent baseline complementary to proprietary modeling systems.
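A Poisson importation calculation with Monte Carlo uncertainty propagation can be sketched roughly as follows. This is not the authors' implementation: the way the expected count Λ is built from travelers, incidence, and length of stay, along with every parameter name and numeric value, is an assumption chosen for illustration. Only the use of Uniform draws on an under-reporting factor and a travel-while-infectious probability comes from the abstract.

```python
import math
import random


def importation_draws(travelers, annual_incidence_per_100k, stay_days,
                      underreport_range, travel_prob_range,
                      n_draws=5000, seed=0):
    """Sorted Monte Carlo draws of the expected importation count (Lambda).

    Assumed structure: reported cases among travelers during the stay, scaled by
    an under-reporting multiplier and a travel-while-infectious probability,
    each drawn from a Uniform distribution.
    """
    rng = random.Random(seed)
    base = travelers * (annual_incidence_per_100k / 100_000) * (stay_days / 365)
    draws = []
    for _ in range(n_draws):
        underreport = rng.uniform(*underreport_range)  # true cases per reported case
        travel_prob = rng.uniform(*travel_prob_range)  # chance of traveling while infectious
        draws.append(base * underreport * travel_prob)
    return sorted(draws)


def summarize(sorted_draws):
    """Median and empirical 2.5th and 97.5th percentiles of sorted draws."""
    n = len(sorted_draws)
    return sorted_draws[n // 2], sorted_draws[int(0.025 * n)], sorted_draws[int(0.975 * n)]


def prob_at_least_one(lam):
    """Poisson probability of one or more importations given expected count lam."""
    return 1.0 - math.exp(-lam)


# Hypothetical inputs, not values from the study.
draws = importation_draws(10_000, 500, 14, (2.0, 10.0), (0.1, 0.5))
median, low, high = summarize(draws)
print(median, low, high, prob_at_least_one(median))
```

Under the Poisson assumption, the expected count Λ maps directly to the probability of at least one importation, which is why both quantities can be reported from the same set of draws.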

19

EMOD with Full Parasite Genetics: A modeling framework for evaluating parasite genetic metrics for operational malaria molecular surveillance

Ribado, J. V.; Suresh, J.; Bridenbecker, D.; Russell, J. R.; Lee, A.; Wenger, E.; Chabot-Couture, G.; Proctor, J. L.; Battle, K. E.; Bever, C. A.

2026-06-08 public and global health 10.64898/2026.06.05.26355027 medRxiv

Top 1%

1.2%

Malaria molecular surveillance (MMS) is becoming increasingly common in endemic settings and has been proposed as a tool for monitoring parasite transmission to inform programmatic decision-making. However, the conditions under which parasite genetic metrics provide interpretable signals for broader use cases, such as assessing intervention impacts and detecting importation, remain under-characterized. We present EMOD with Full Parasite Genetics (FPG), a simulation framework designed to explore how parasite genetic metrics arise from transmission, intervention, importation, and sampling processes at programmatically relevant timescales. Using seasonal scenarios across a range of transmission intensities, we demonstrate three principal findings. First, genetic metrics can detect insecticide-treated net intervention impacts at seasonal and yearly timescales, but the strength, timing, and form of the relationship between genetic and epidemiological measures vary by metric and sampling timing. Second, importation can break the expected relationship between parasite genetic diversity and local transmission intensity at very low incidence, allowing low-transmission settings with substantial importation to maintain elevated diversity metrics. Third, convenience sampling practices, including sample size, collection timing, and the clinical composition of sampled populations, introduce non-random biases in genetic metric estimation in a way that obscures the true transmission signal. Together, these findings show that parasite genetic metrics can support operational surveillance, but that their interpretation depends on transmission context, importation, metric choice, and sampling design. EMOD FPG provides a framework for evaluating these dependencies in future setting-specific analyses and for guiding the interpretation of parasite genetic data across sites and over time.

20

Cultural engagement and mental disorders: A prospective negative control analysis of the English Longitudinal Study of Ageing with linked Hospital Episode Statistics

Qin, P.; Steptoe, A.; Fancourt, D.

2026-06-08 epidemiology 10.64898/2026.06.05.26354991 medRxiv

Top 1%

1.1%

Cultural engagement is associated longitudinally with better mental health and reduced depression incidence, but evidence has largely relied on self-reported symptoms and diagnoses, leaving uncertainty about clinically recorded disorders, and residual confounding remains a concern. Here, we examined whether cultural engagement (including going to cinemas, museums, galleries, exhibitions, theatre, concerts, or opera) predicts hospital-treated mental disorders in 8,274 adults aged 50 years or older from the English Longitudinal Study of Ageing. Participant records were linked to ICD-10 diagnoses in Hospital Episode Statistics and mortality records with follow-up of up to 20 years. In fully adjusted Cox models accounting for sociodemographic, lifestyle, and social factors and multiple testing, frequent cultural engagement was associated with lower risk of any mental disorders (HR 0.71, 95% CI 0.62-0.82, FDR adjusted P value<0.001), dementia (0.71, 0.56-0.89, FDR adjusted P value=0.010), substance misuse (0.75, 0.59-0.95, FDR adjusted P value=0.040), and mood disorders (0.73, 0.56-0.95, FDR adjusted P value=0.044), but not neurotic disorders. Associations persisted after excluding early incident cases and adjusting for baseline depressive symptoms and cognition, and showed robustness to unmeasured confounders. To further probe causality, eye disease, ear disease, and traumatic brain injury, which share similar socio-demographic profiles to mental disorders, were prespecified as negative control outcomes. Cultural engagement was not associated with any negative control outcomes. These findings provide triangulated statistical data to suggest that cultural engagement is associated with reduced risk of several clinically recorded mental disorders and support further testing of cultural engagement as a population mental health strategy.